منابع مشابه
Recognition of Superimposed Caption
The automatic extraction and reading of news captions and annotations can be of great help locating topics of interest in digital news video archives. To achieve this goal, we present a technique, called Video OCR, which detects, extracts, and reads text areas in digital video data. In this paper, we address problems, describe the method by which Video OCR operates, and suggest applications for...
متن کاملImage2Text: A Multimodal Caption Generator
In this work, we showcase the Image2Text system, which is a real-time captioning system that can generate human-level natural language description for any input image. We formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural networks (RNN) model for image caption generation. Different from...
متن کاملCross-Lingual Image Caption Generation
Automatically generating a natural language description of an image is a fundamental problem in artificial intelligence. This task involves both computer vision and natural language processing and is called “image caption generation.” Research on image caption generation has typically focused on taking in an image and generating a caption in English as existing image caption corpora are mostly ...
متن کاملTopic-Specific Image Caption Generation
Recently, image caption which aims to generate a textual description for an image automatically has attracted researchers from various fields. Encouraging performance has been achieved by applying deep neural networks. Most of these works aim at generating a single caption which may be incomprehensive, especially for complex images. This paper proposes a topic-specific multi-caption generator, ...
متن کاملReview Networks for Caption Generation
We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoderdecoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with attention mechanism on the encoder hidden states, and outputs a thought vector after each review s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Nature
سال: 2002
ISSN: 0028-0836,1476-4687
DOI: 10.1038/420359d